1 Introduction

In [4], Cohn and Elkies introduce linear programming bounds for the sphere 
packing problem, and use them to prove new upper bounds on the sphere 
packing density in low dimensions. These bounds are the best bounds known 
in dimensions 4 through 36, and seem to be sharp in dimensions 8 and 24, 
although that has not yet been proved. Here, we continue the study of these 
bounds, by giving another derivation of the main theorem of [4]. We then prove 
an optimality theorem of Gorbachev [8], and outline in some conjectures how 
the proof techniques should apply more generally. 

We continue to use the notation of [4]. See the introduction of that paper for 
background and references on sphere packing. 

The main theorem Cohn and Elkies prove is the following: 

Theorem 1.1 Suppose $f\colon \mathbb{R}^n \to \mathbb{R}$ is a radial, admissible function, is not
identically zero, and satisfies the following two conditions:

(1) $f(x) \le 0$ for $|x| \ge 1$, and

(2) $\hat f(t) \ge 0$ for all $t$.

Then the center densities of n-dimensional sphere packings are bounded above
by

\[ \frac{f(0)}{2^n \hat f(0)}. \]

Here, the Fourier transform is normalized by

\[ \hat f(t) = \int_{\mathbb{R}^n} f(x)\, e^{2\pi i \langle x, t \rangle}\, dx, \]

and admissibility means that there is a constant $\varepsilon > 0$ such that both $|f(x)|$
and $|\hat f(x)|$ are bounded above by a constant times $(1+|x|)^{-n-\varepsilon}$. More broadly,
we could in fact take $f$ to be any function to which the Poisson summation
formula applies: for every lattice $\Lambda \subset \mathbb{R}^n$ and every vector $v \in \mathbb{R}^n$,

\[ \sum_{x \in \Lambda} f(x+v) = \frac{1}{\mathrm{vol}(\mathbb{R}^n/\Lambda)} \sum_{t \in \Lambda^*} e^{-2\pi i \langle v, t \rangle}\, \hat f(t). \]

However, the narrower definition of admissibility is easier to check and
seemingly suffices for all natural examples.

Section 2 gives another proof of Theorem 1.1, for n > 1. This proof is not
as simple as the one in [4], but the method is of interest in its own right, as
are some of the intermediate results. Section 3 proves Gorbachev's theorem [8]
that certain admissible functions (those constructed in Proposition 6.1 of [4], 
or independently by Gorbachev) are optimal, among functions whose Fourier 
transforms have support in a certain ball. Finally, Section 4 discusses the dual 
linear program, and puts the techniques of Section 3 into a broader context.

2 Positivity of theta series coefficients

We will prove Theorem 1.1 using the positivity of the coefficients of the theta
series of lattices. For each lattice, the theta series of its dual must have positive
coefficients, and these coefficients are some transformation of those for the
original lattice. This puts strong constraints on the theta series of a lattice,
which we exploit below. For simplicity, we will deal only with the case of lattice
packings, but everything in this section applies to all sphere packings, by
replacing the theta series of a lattice with the average theta series of a periodic
packing (see [5, page 45]). Also, for technical reasons we will deal only with the
case n > 1, which is not a serious restriction as 1-dimensional sphere packing
is trivial.

Unfortunately, carrying this program out rigorously involves dealing with a 
number of technicalities. If one simply wants an idea of the overall argument, 
without worrying about rigor, one can follow this plan: Ignore Lemma 2.4 and 
all references to Cesaro sums, and assume that all Laguerre series converge. 
Ignore the uniformity of convergence in Lemma 2.6 (in which case the proof 
becomes far simpler). Ignore the justification of interchanging the sum and 
integral in Lemma 2.7. Following this plan will of course not lead to a rigorous 
proof, but it may make the underlying ideas clearer.

Before going further, we need a lemma about Laguerre polynomials. Let $L_k^\alpha$ be
the Laguerre polynomial of degree $k$ and parameter $\alpha > -1$. These polynomials
are orthogonal with respect to the weight $x^\alpha e^{-x}\,dx$ on $[0,\infty)$.

Lemma 2.1 For every non-negative integer $k$, $\alpha > -1$, and $y \in \mathbb{R}$, we have

\[ \frac{(-1)^k}{k!}\frac{d^k}{du^k}\left(u^{-\alpha-1}e^{-y/u}\right) = u^{-\alpha-1-k}\,L_k^\alpha(y/u)\,e^{-y/u}. \]

Proof This is easily proved by induction, using standard properties of Laguerre
polynomials (see Section 6.2 of [1], or Sections 4.17-4.24 of [10]). □
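
As a quick numerical sanity check, the following Python sketch compares the two sides of Lemma 2.1, read here as $\frac{(-1)^k}{k!}\frac{d^k}{du^k}\bigl(u^{-\alpha-1}e^{-y/u}\bigr) = u^{-\alpha-1-k}L_k^\alpha(y/u)e^{-y/u}$ (this reading is an assumption, chosen to match the way the lemma is used below). The helper names `laguerre`, `lhs_lemma_2_1`, and `rhs_lemma_2_1` are ours, and the derivative is only approximated by central finite differences, so agreement is expected up to discretization error.

```python
import math

def laguerre(k, alpha, x):
    """Generalized Laguerre polynomial L_k^alpha(x) via the three-term recurrence
    (j+1) L_{j+1} = (2j+1+alpha-x) L_j - (j+alpha) L_{j-1}."""
    if k == 0:
        return 1.0
    prev, cur = 1.0, 1.0 + alpha - x
    for j in range(1, k):
        prev, cur = cur, ((2 * j + 1 + alpha - x) * cur - (j + alpha) * prev) / (j + 1)
    return cur

def lhs_lemma_2_1(k, alpha, y, u, h=1e-2):
    """(-1)^k / k! times a central-difference estimate of the k-th u-derivative
    of u^(-alpha-1) * exp(-y/u)."""
    def g(v):
        return v ** (-alpha - 1) * math.exp(-y / v)
    diff = sum((-1) ** i * math.comb(k, i) * g(u + (k / 2 - i) * h) for i in range(k + 1))
    return (-1) ** k / math.factorial(k) * diff / h ** k

def rhs_lemma_2_1(k, alpha, y, u):
    """u^(-alpha-1-k) * L_k^alpha(y/u) * exp(-y/u)."""
    return u ** (-alpha - 1 - k) * laguerre(k, alpha, y / u) * math.exp(-y / u)

if __name__ == "__main__":
    for k in range(4):
        print(k, lhs_lemma_2_1(k, 1.0, 1.5, 2.0), rhs_lemma_2_1(k, 1.0, 1.5, 2.0))
```

The two printed columns should agree to several digits; the step size `h` trades truncation error against rounding error.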

Suppose $\Lambda \subset \mathbb{R}^n$ is a lattice, and define a measure $\mu$ on $[0,\infty)$ consisting of a
point mass at $x$ for each vector in $\Lambda$ of norm $x$, where the norm of $v$ is $\langle v,v\rangle$.
The purpose of $\mu$ is to allow us to sum over all lattice vectors without having
to index the sum in our notation; instead, we simply integrate with respect to
$\mu$. Although $\mu$ depends on $\Lambda$, for simplicity our notation does not make that
dependence explicit.

The key positivity property of $\mu$ is the following lemma:

Lemma 2.2 For all $y > 0$ and all non-negative integers $k$,

\[ \int_0^\infty L_k^{n/2-1}(xy)\,e^{-xy}\,d\mu(x) \ge 0. \]

Proof The theta series of $\Lambda$ is given by

\[ \Theta_\Lambda(z) = \int_0^\infty e^{\pi i x z}\,d\mu(x), \]

and it follows from the Poisson summation formula that the theta series of the
dual lattice $\Lambda^*$ is given by

\[ \Theta_{\Lambda^*}(z) = \mathrm{vol}(\mathbb{R}^n/\Lambda)\left(\frac{i}{z}\right)^{n/2}\Theta_\Lambda\left(-\frac{1}{z}\right). \]

(See equation (19) in [5, page 103].)

It will be more convenient for us to work with the variable $y$ given by $y = -\pi i z$.
Let $T(y) = \Theta_\Lambda(z)$, so that

\[ T(y) = \int_0^\infty e^{-xy}\,d\mu(x). \]

Then up to a positive factor, the theta series of $\Lambda^*$ is given by $y^{-n/2}T(\pi^2/y)$.

We know that $y^{-n/2}T(\pi^2/y)$ is a positive linear combination of functions $e^{-cy}$
with $c \ge 0$, because it is the theta series of a lattice (times a positive constant).
Hence, its successive derivatives with respect to $y$ alternate in sign. We have

\[ y^{-n/2}T(\pi^2/y) = \int_0^\infty y^{-n/2}e^{-\pi^2 x/y}\,d\mu(x), \]

from which it follows using Lemma 2.1 that

\[ \frac{(-1)^k}{k!}\frac{d^k}{dy^k}\left(y^{-n/2}T(\pi^2/y)\right) = \int_0^\infty y^{-n/2-k}\,L_k^{n/2-1}(\pi^2 x/y)\,e^{-\pi^2 x/y}\,d\mu(x). \]

(Differentiating under the integral sign, which really denotes a sum, is justified
by uniform convergence of the differentiated sum; see Theorem 7.17 of [13].)
Now the change of variable $y \mapsto \pi^2/y$ shows us that

\[ \int_0^\infty L_k^{n/2-1}(xy)\,e^{-xy}\,d\mu(x) \ge 0, \]

as desired. □
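
To make the positivity statement concrete, here is a small Python sketch that evaluates the sum in Lemma 2.2, read as $\sum_{v\in\Lambda} L_k^{n/2-1}(|v|^2y)e^{-|v|^2y} \ge 0$, for the square lattice $\mathbb{Z}^2$ (so $n = 2$ and the Laguerre parameter is 0). The choice of lattice, the truncation `cutoff`, and the function names are our own illustrative assumptions; the truncated sum only approximates the full lattice sum, although the neglected terms are negligible for the values of $y$ used.

```python
import math

def laguerre(k, alpha, x):
    """Generalized Laguerre polynomial L_k^alpha(x) via the standard recurrence."""
    if k == 0:
        return 1.0
    prev, cur = 1.0, 1.0 + alpha - x
    for j in range(1, k):
        prev, cur = cur, ((2 * j + 1 + alpha - x) * cur - (j + alpha) * prev) / (j + 1)
    return cur

def lattice_sum_Z2(k, y, cutoff=40):
    """Sum of L_k^0(|v|^2 y) * exp(-|v|^2 y) over v in Z^2 with |coordinates| <= cutoff."""
    total = 0.0
    for a in range(-cutoff, cutoff + 1):
        for b in range(-cutoff, cutoff + 1):
            x = a * a + b * b  # the norm <v, v> of the lattice vector
            total += laguerre(k, 0.0, x * y) * math.exp(-x * y)
    return total

if __name__ == "__main__":
    for k in range(4):
        print(k, [lattice_sum_Z2(k, y) for y in (0.5, 1.0, 2.0)])
```

Individual terms of the sum change sign once $k \ge 1$; the lemma says that the total never does.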

When we use only the fact that the derivatives of $y^{-n/2}T(\pi^2/y)$ alternate in
sign, we do not lose much information: by a theorem of Bernstein (see Section
12 of Chapter IV of [22]), this property characterizes functions of the form

\[ \int_0^\infty e^{-xy}\,d\mu(x) \]

for some measure $\mu$ on $[0,\infty)$. Also, it is not surprising that the inequalities in
Lemma 2.2 occur for all scalings $y$, because so far our setup is scale-invariant.

If the shortest non-zero vectors in $\Lambda$ have length 1 (that is, $\Lambda$ leads to a packing
with balls of radius 1/2), then the center density of the lattice packing given
by $\Lambda$ equals

\[ (4\pi)^{-n/2}\lim_{y\to 0+} y^{n/2}\,T(y). \]

The proof is as follows. The relationship between the theta series of $\Lambda^*$ and $\Lambda$
is

\[ \frac{\Theta_{\Lambda^*}(iy/\pi)}{2^n\,\mathrm{vol}(\mathbb{R}^n/\Lambda)} = (4\pi)^{-n/2}\left(\frac{\pi^2}{y}\right)^{n/2}T\left(\frac{\pi^2}{y}\right). \]

As we let $y \to \infty$, the right hand side becomes the limit above, and the left
hand side tends to $1/(2^n\,\mathrm{vol}(\mathbb{R}^n/\Lambda))$, which is the center density.
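
The limit formula for the center density, $(4\pi)^{-n/2}\lim_{y\to0+}y^{n/2}T(y)$, can be checked on the integer lattice $\mathbb{Z}^n$, whose shortest vectors have length 1 and whose center density is $2^{-n}$. The sketch below is only an illustration under that choice of lattice; `theta_T` and `center_density_estimate` are hypothetical helper names, and the one-dimensional sum is truncated at `cutoff`.

```python
import math

def theta_T(y, n=2, cutoff=400):
    """T(y) = sum over v in Z^n of exp(-|v|^2 y), as the n-th power of a 1-D sum."""
    one_dim = sum(math.exp(-m * m * y) for m in range(-cutoff, cutoff + 1))
    return one_dim ** n

def center_density_estimate(y, n=2):
    """(4 pi)^(-n/2) * y^(n/2) * T(y); tends to the center density as y -> 0+."""
    return (4 * math.pi) ** (-n / 2) * y ** (n / 2) * theta_T(y, n)

if __name__ == "__main__":
    for y in (1.0, 0.1, 0.01):
        print(y, center_density_estimate(y, 2), center_density_estimate(y, 3))
```

For small `y` the two printed columns approach 1/4 and 1/8, the center densities of $\mathbb{Z}^2$ and $\mathbb{Z}^3$.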

Using Lemma 2.2, we can bound the center density. First, we need a definition
and a lemma.

Definition 2.3 A function $f\colon [0,\infty) \to \mathbb{R}$ has the $\alpha$-SILP property ("scale-
invariant Laguerre positivity") if the following conditions hold:

(1) $f$ is continuous and for some $\varepsilon > 0$ and $C > 0$, we have

\[ |f(x)| \le C(1+|x|)^{-\alpha-1-\varepsilon} \]

for all $x$, and

(2) for every $y > 0$, the Laguerre series

\[ \sum_{j \ge 0} a_j(y)\,L_j^\alpha(x) \]

for $x \mapsto f(x/y)$ has $a_j(y) \ge 0$ for all $j$.

Condition (1) is merely a technical restriction; condition (2) is the heart of the
matter. Notice that the orthogonality of the Laguerre polynomials implies that

\[ a_j(y) = \frac{\int_0^\infty f(x/y)\,L_j^\alpha(x)\,x^\alpha e^{-x}\,dx}{\int_0^\infty L_j^\alpha(x)^2\,x^\alpha e^{-x}\,dx} = \frac{\int_0^\infty f(x/y)\,L_j^\alpha(x)\,x^\alpha e^{-x}\,dx}{\Gamma(j+\alpha+1)/j!}. \]

We make no assumption about convergence for the Laguerre series in Definition 2.3.
However, the following analogue of Fejér's theorem on Fourier series
holds. It is a simple consequence of results in [20]. We could also make use
of [16] to prove a marginally weaker result (which would still suffice for our
purposes).
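
The coefficient formula $a_j(y) = \frac{j!}{\Gamma(j+\alpha+1)}\int_0^\infty f(x/y)L_j^\alpha(x)x^\alpha e^{-x}\,dx$ from Definition 2.3 can be tried out on a simple test function. The sketch below takes $f(x) = e^{-x}$ and $\alpha = 0$ (our choice of example, not one made in the text), for which the classical Laplace transform $\int_0^\infty e^{-sx}L_j(x)\,dx = (s-1)^j/s^{j+1}$ gives $a_j(y) = y/(1+y)^{j+1} > 0$. The quadrature rule, its parameters, and the helper names are assumptions made for illustration.

```python
import math

def laguerre(k, alpha, x):
    """Generalized Laguerre polynomial L_k^alpha(x) via the standard recurrence."""
    if k == 0:
        return 1.0
    prev, cur = 1.0, 1.0 + alpha - x
    for j in range(1, k):
        prev, cur = cur, ((2 * j + 1 + alpha - x) * cur - (j + alpha) * prev) / (j + 1)
    return cur

def laguerre_coefficient(f, j, y, alpha=0.0, upper=80.0, steps=20000):
    """a_j(y) for x -> f(x/y), by composite Simpson quadrature on [0, upper].
    Assumes alpha >= 0 so that the integrand is finite at x = 0."""
    h = upper / steps

    def integrand(x):
        return f(x / y) * laguerre(j, alpha, x) * x ** alpha * math.exp(-x)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    norm = math.gamma(j + alpha + 1) / math.factorial(j)
    return total * h / 3 / norm

if __name__ == "__main__":
    f = lambda x: math.exp(-x)
    for j in range(4):
        print(j, laguerre_coefficient(f, j, 2.0), 2.0 / 3.0 ** (j + 1))
```

All coefficients come out positive for every scaling `y`, which is condition (2) of the definition for this particular function.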

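Lemma 2.4 Let $\alpha \ge 0$, and let $f\colon [0,\infty) \to \mathbb{R}$ be an $\alpha$-SILP function. Then
for all $k > \alpha + 1/2$, the $(C,k)$ Cesàro means

\[ \binom{m+k}{m}^{-1}\sum_{j=0}^{m}\binom{m-j+k}{m-j}\,a_j(y)\,L_j^\alpha(x)\,e^{-x/2} \]

of the partial sums of the series

\[ \sum_{j \ge 0} a_j(y)\,L_j^\alpha(x)\,e^{-x/2} \]

converge uniformly to $f(x/y)e^{-x/2}$ on $[0,\infty)$, as $m \to \infty$. (Here, $a_j(y)$ is as
above.)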

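Proof We take $y = 1$ for notational simplicity; of course, the same proof holds
for each $y > 0$. For a function $g\colon [0,\infty) \to \mathbb{R}$, let $\tilde g(x) = g(x)e^{-x/2}$, and let
$\sigma_m g(x)$ denote the Cesàro mean

\[ \sigma_m g(x) = \binom{m+k}{m}^{-1}\sum_{j=0}^{m}\binom{m-j+k}{m-j}\,b_j\,L_j^\alpha(x)\,e^{-x/2}, \]

where $g$ has Laguerre coefficients $b_j$. Theorem 6.2.1 of [20] says that there exists
a constant $C$ such that for all $m$ and all $g$ such that $\tilde g \in L^\infty([0,\infty), x^\alpha\,dx)$,

\[ \|\sigma_m g\|_\infty \le C\,\|\tilde g\|_\infty, \]

where $\|\cdot\|_\infty$ denotes the norm on $L^\infty([0,\infty), x^\alpha\,dx)$.

We can then imitate the proof of Theorem 2 in [12]. Let $\varepsilon > 0$. By Theorem 18
of [17], $\tilde f$ can be uniformly approximated on $[0,\infty)$ by $\tilde g$ with $g$ a polynomial.
Choose $g$ so that

\[ \|\tilde f - \tilde g\|_\infty < \frac{\varepsilon}{2+2C}. \]

Then

\[ \|\sigma_m f - \sigma_m g\|_\infty \le \frac{C\varepsilon}{2+2C}. \]

For sufficiently large $m$, we have

\[ \|\sigma_m g - \tilde g\|_\infty < \frac{\varepsilon}{2}, \]

since $g$ is a polynomial. It follows that

\[ \|\sigma_m f - \tilde f\|_\infty < \varepsilon. \]

Thus, $\sigma_m f$ tends uniformly to $\tilde f$ as $m \to \infty$. □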



Of course, this proof made no use of the positivity of the Laguerre coefficients,
and in fact could be carried out with far weaker constraints on the behavior
of $f$ at infinity. We stated it in terms of $\alpha$-SILP functions only because those
are the functions to which we will apply it. The requirement that $\alpha$ be non-negative
is part of the hypotheses of Theorem 6.2.1 of [20]. Perhaps one could
prove an analogue of Lemma 2.4 for $\alpha < 0$, but in terms of sphere packing that
would cover only the one-dimensional case.

Theorem 2.5 Let $n > 1$. Suppose $f$ has the $(n/2-1)$-SILP property, with
$f(0) = 1$ and $f(x) \le 0$ for $x \ge 1$. Then the center density for n-dimensional
lattice packings is bounded above by

\[ \frac{\Gamma(n/2)}{2^n\pi^{n/2}\int_0^\infty f(x)\,x^{n/2-1}\,dx}. \]

As was pointed out above, the same bound holds for all sphere packings, not
just lattice packings. One can prove this more general result by replacing the
theta series of a lattice with the averaged theta series of a periodic packing in
Lemma 2.2, but for simplicity we restrict our attention to lattices.

Proof Without loss of generality, we can assume that our lattice is scaled so
as to have packing radius 1/2 (that is, every non-zero vector has norm at least
1). Define $\mu$, $T$, $a_k(y)$, etc. as before.

We have

\[ f(0) \ge \int_0^\infty f(x)\,e^{-xy}\,d\mu(x), \]

since all contributions to the integral from $x > 0$ are non-positive.
Let $k > (n-1)/2$, and

\[ \sigma_m f(x) = \binom{m+k}{m}^{-1}\sum_{j=0}^{m}\binom{m-j+k}{m-j}\,a_j(y)\,L_j^{n/2-1}(xy)\,e^{-xy/2}. \]

Then

\[ \int_0^\infty \sigma_m f(x)\,e^{-xy/2}\,d\mu(x) \ge a_0(y)\int_0^\infty e^{-xy}\,d\mu(x) = a_0(y)\,T(y), \]

since by Lemma 2.2 all the terms in $\sigma_m f(x)$ with $j > 0$ contribute a non-negative
amount. Since $\sigma_m f(x)$ converges uniformly to $f(x)e^{-xy/2}$ as $m \to \infty$
by Lemma 2.4 (and because constant functions are integrable with respect to
$e^{-xy/2}\,d\mu(x)$), we have

\[ \lim_{m\to\infty}\int_0^\infty \sigma_m f(x)\,e^{-xy/2}\,d\mu(x) = \int_0^\infty f(x)\,e^{-xy}\,d\mu(x). \]

It follows that

\[ \int_0^\infty f(x)\,e^{-xy}\,d\mu(x) \ge a_0(y)\,T(y), \]

and hence

\[ f(0) \ge a_0(y)\,T(y). \]

Thus, the center density is bounded above by

\[ \lim_{y\to 0+}\frac{y^{n/2}}{(4\pi)^{n/2}\,a_0(y)}. \]

We can evaluate that limit, since

\[ a_0(y) = \frac{\int_0^\infty f(x/y)\,x^{n/2-1}e^{-x}\,dx}{\Gamma(n/2)} = \frac{y^{n/2}\int_0^\infty f(u)\,u^{n/2-1}e^{-yu}\,du}{\Gamma(n/2)}, \]

and $\int_0^\infty f(u)u^{n/2-1}e^{-yu}\,du$ converges to $\int_0^\infty f(u)u^{n/2-1}\,du$ as $y \to 0+$, by dominated
convergence. Applying this formula leads to the bound in the theorem
statement. □

Theorem 2.5 amounts to essentially the same bound as Theorem 1.1, although
that is not immediately obvious. The key is Proposition 2.8, which tells us that
there is essentially only one $\alpha$-SILP function for each $\alpha$, in the sense that every
$\alpha$-SILP function is a positive combination of scalings of this function. First,
we need two technical lemmas.

Lemma 2.6 For $\alpha > -1/2$ and $x \in [0,\infty)$,

\[ \lim_{k\to\infty} k^{-\alpha}\,L_k^\alpha(x/k)\,e^{-x/k} = x^{-\alpha/2}J_\alpha(2\sqrt{x}), \]

and convergence is uniform over $[0,\infty)$.

Note that uniform convergence is false for $\alpha = -1/2$, because $k^{-\alpha}L_k^\alpha(x/k)e^{-x/k}$
tends to 0 as $x \to \infty$ but the right side does not. Since we take $\alpha = n/2 - 1$
in dimension $n$, the only case this rules out is the trivial 1-dimensional case,
and that is hardly a problem since it is already ruled out by Theorem 2.5 (via
Lemma 2.4).
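
The limit in Lemma 2.6, read as $k^{-\alpha}L_k^\alpha(x/k)e^{-x/k} \to x^{-\alpha/2}J_\alpha(2\sqrt{x})$, is easy to observe numerically. The following sketch is an illustration only: it evaluates the right side through the power series $x^{-\alpha/2}J_\alpha(2\sqrt{x}) = \sum_{m\ge0}(-x)^m/(m!\,\Gamma(m+\alpha+1))$, and the helper names and the truncation `terms` are our assumptions. It checks a few sample points and does not test uniformity.

```python
import math

def laguerre(k, alpha, x):
    """Generalized Laguerre polynomial L_k^alpha(x) via the standard recurrence."""
    if k == 0:
        return 1.0
    prev, cur = 1.0, 1.0 + alpha - x
    for j in range(1, k):
        prev, cur = cur, ((2 * j + 1 + alpha - x) * cur - (j + alpha) * prev) / (j + 1)
    return cur

def laguerre_side(alpha, x, k):
    """k^(-alpha) * L_k^alpha(x/k) * exp(-x/k)."""
    return k ** (-alpha) * laguerre(k, alpha, x / k) * math.exp(-x / k)

def bessel_side(alpha, x, terms=60):
    """x^(-alpha/2) * J_alpha(2 sqrt(x)), summed from its power series in x."""
    return sum((-x) ** m / (math.factorial(m) * math.gamma(m + alpha + 1))
               for m in range(terms))

if __name__ == "__main__":
    for k in (10, 100, 1000, 4000):
        print(k, laguerre_side(0.5, 3.0, k), bessel_side(0.5, 3.0))
```

The gap between the two columns shrinks roughly like $1/k$ at a fixed point $x$.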

Proof Pointwise convergence is known (see 10.12 (36) in [7, page 191]), but
the statements the author knows of in the literature omit the $e^{-x/k}$ factor that
makes the convergence uniform.

We consider two cases. In the first, $x \ge k^{1+\delta}$ with $\delta > 0$ fixed as $k \to \infty$. Then
$x^{-\alpha/2}J_\alpha(2\sqrt{x})$ tends uniformly to 0 as $k \to \infty$, and we just need to verify that
$k^{-\alpha}L_k^\alpha(x/k)e^{-x/k}$ does as well. For that, we use Theorem 8.91.2 from [19]. It
implies that for $a > 0$,

\[ \max_{x \ge a} e^{-x/2}\,|L_k^\alpha(x)| = O(k^C), \]

where $C = \max(-1/3, \alpha/2 - 1/4)$. It follows that $k^{-\alpha}L_k^\alpha(x/k)e^{-x/k}$ tends
uniformly to 0 as $k \to \infty$ with $x \ge k^{1+\delta}$.

Thus, we need only deal with the case of $x < k^{1+\delta}$. We start with (4.19.3)
from [10] (which holds for all $\alpha > -1$, not just $\alpha > 1$ as inadvertently stated
in [10]), which says that

\[ L_k^\alpha(x) = \frac{e^x x^{-\alpha/2}}{k!}\int_0^\infty t^{k+\alpha/2}\,J_\alpha(2\sqrt{xt})\,e^{-t}\,dt. \]

Thus,

\[ k^{-\alpha}L_k^\alpha(x/k)e^{-x/k} = \frac{x^{-\alpha/2}k^{k+1}}{k!}\int_0^\infty t^{\alpha/2}J_\alpha(2\sqrt{xt})\,e^{k(\log t - t)}\,dt \]
\[ = (1+o(1))\,e^k\sqrt{\frac{k}{2\pi}}\int_0^\infty (t/x)^{\alpha/2}J_\alpha(2\sqrt{xt})\,e^{k(\log t - t)}\,dt. \]


Geometry & Topology, Volume 6 (2002)

Henry Cohn



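The exponent $\log t - t$ is maximized at $t = 1$, so we can use the Laplace method
to estimate this integral (see Chapter 4 of [3]). In the following calculations,
all constants implicit in big-$O$ terms are independent of $x$.

Let $\varepsilon > 0$ be small ($\varepsilon$ will be a function of $k$). Our integral nearly equals that
over the interval $[1-\varepsilon, 1+\varepsilon]$, since for any $C < 1/2$ we have $\log t - t < -1 - C\varepsilon^2$
outside $[1-\varepsilon, 1+\varepsilon]$ for sufficiently small $\varepsilon$, and hence

\[ \left|\left(\int_0^\infty - \int_{1-\varepsilon}^{1+\varepsilon}\right)(t/x)^{\alpha/2}J_\alpha(2\sqrt{xt})\,e^{k(\log t - t)}\,dt\right| \]

is bounded by

\[ e^{-(k-1)(1+C\varepsilon^2)}\int_0^\infty \left|(t/x)^{\alpha/2}J_\alpha(2\sqrt{xt})\right| e^{\log t - t}\,dt. \]

Thus, we just need to estimate

\[ \int_{1-\varepsilon}^{1+\varepsilon}(t/x)^{\alpha/2}J_\alpha(2\sqrt{xt})\,e^{k(\log t - t)}\,dt. \]

We would like to approximate it with

\[ x^{-\alpha/2}J_\alpha(2\sqrt{x})\int_{1-\varepsilon}^{1+\varepsilon}e^{k(\log t - t)}\,dt. \]

The difference between these integrals is bounded by a constant times the
product of $\varepsilon$, the maximum of the $t$-derivative of $(t/x)^{\alpha/2}J_\alpha(2\sqrt{xt})$ over
$t \in [1-\varepsilon, 1+\varepsilon]$, and

\[ \int_{1-\varepsilon}^{1+\varepsilon}e^{k(\log t - t)}\,dt. \]

We have

\[ \frac{\partial}{\partial t}\left((t/x)^{\alpha/2}J_\alpha(2\sqrt{xt})\right) = \frac{\alpha}{2}\,t^{\alpha/2-1}x^{-\alpha/2}J_\alpha(2\sqrt{xt}) + (t/x)^{\alpha/2}\left(-J_{\alpha+1}(2\sqrt{xt}) + \frac{\alpha J_\alpha(2\sqrt{xt})}{2\sqrt{xt}}\right)\sqrt{\frac{x}{t}}. \]

For $x$ near 0 this remains bounded; for $x$ away from 0 it
is at most $O(x^{1/4-\alpha/2})$, which is at most $O(x^{1/2-\delta})$ if $\delta$ is small enough relative
to $\alpha$ (which we can assume). Because $x < k^{1+\delta}$, we have $x^{1/2-\delta} \le k^{1/2-\delta/2}$.

Thus,

\[ \int_{1-\varepsilon}^{1+\varepsilon}(t/x)^{\alpha/2}J_\alpha(2\sqrt{xt})\,e^{k(\log t - t)}\,dt \]

equals

\[ \left(x^{-\alpha/2}J_\alpha(2\sqrt{x}) + O\!\left(\varepsilon k^{1/2-\delta/2}\right)\right)\int_{1-\varepsilon}^{1+\varepsilon}e^{k(\log t - t)}\,dt. \]

If we expand $\log t - t = -1 - (t-1)^2/2 + O((t-1)^3)$, we find that

\[ \int_{1-\varepsilon}^{1+\varepsilon}e^{k(\log t - t)}\,dt = (1+o(1))\,e^{-k}\sqrt{\frac{2\pi}{k}}, \]

as long as $k\varepsilon^2 \to \infty$, so that the interval we are integrating over is much wider
than the standard deviation of the Gaussian we are using to approximate the
integrand.

So far, we know that as long as $k\varepsilon^2 \to \infty$, we have

\[ k^{-\alpha}L_k^\alpha(x/k)e^{-x/k} = (1+o(1))\,x^{-\alpha/2}J_\alpha(2\sqrt{x}) + O\!\left(\sqrt{k}\,e^{-C\varepsilon^2k}\right) + O\!\left(\varepsilon k^{1/2-\delta/2}\right). \]

Now if we take $\varepsilon = k^{-\beta}$ with $(1-\delta)/2 < \beta < 1/2$, we find that

\[ k^{-\alpha}L_k^\alpha(x/k)e^{-x/k} = x^{-\alpha/2}J_\alpha(2\sqrt{x}) + o(1), \]

as desired. □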

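Lemma 2.7 For $\alpha > -1/2$, if $f\colon [0,\infty) \to \mathbb{R}$ is continuous and satisfies

\[ |f(x)| \le C(1+|x|)^{-\alpha-1-\varepsilon} \]

for some $C > 0$ and $\varepsilon > 0$, then

\[ \sum_{k \ge 0} t^k\int_0^\infty f(x/y)\,L_k^\alpha(x)\,x^\alpha e^{-x}\,dx = (1-t)^{-\alpha-1}\int_0^\infty f(x/y)\,x^\alpha e^{-x/(1-t)}\,dx \]

whenever $|t| < 1/3$.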
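
The identity in this lemma rests on the classical generating function $\sum_{k\ge0}L_k^\alpha(x)t^k = (1-t)^{-\alpha-1}e^{-xt/(1-t)}$. The short sketch below compares a truncated sum with the closed form at a few points with $|t| < 1/3$; the helper names and the truncation `terms` are our assumptions, and the check says nothing about the interchange of sum and integral that the proof has to justify.

```python
import math

def laguerre(k, alpha, x):
    """Generalized Laguerre polynomial L_k^alpha(x) via the standard recurrence."""
    if k == 0:
        return 1.0
    prev, cur = 1.0, 1.0 + alpha - x
    for j in range(1, k):
        prev, cur = cur, ((2 * j + 1 + alpha - x) * cur - (j + alpha) * prev) / (j + 1)
    return cur

def generating_series(alpha, x, t, terms=200):
    """Truncated sum of L_k^alpha(x) * t^k over 0 <= k < terms."""
    return sum(laguerre(k, alpha, x) * t ** k for k in range(terms))

def generating_closed_form(alpha, x, t):
    """(1 - t)^(-alpha-1) * exp(-x t / (1 - t))."""
    return (1 - t) ** (-alpha - 1) * math.exp(-x * t / (1 - t))

if __name__ == "__main__":
    for t in (0.3, -0.3, 0.1):
        print(t, generating_series(0.5, 2.0, t), generating_closed_form(0.5, 2.0, t))
```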

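Proof We would like to convert this sum to

\[ \int_0^\infty \sum_{k \ge 0} f(x/y)\,L_k^\alpha(x)\,x^\alpha e^{-x}\,t^k\,dx \]

and apply the generating function

\[ \sum_{k \ge 0} L_k^\alpha(x)\,t^k = (1-t)^{-\alpha-1}e^{-xt/(1-t)} \]

((6.2.4) from [1]). To do so, we must justify interchanging the sum with the
integral.

Let

\[ g(t) = (1-t)^{-\alpha-1}e^{-xt/(1-t)} = (1-t)^{-\alpha-1}e^{x}e^{-x/(1-t)}. \]

Then the Lagrange form of the remainder in Taylor's theorem implies

\[ g(t) = \sum_{k=0}^{m-1} L_k^\alpha(x)\,t^k + \frac{g^{(m)}(s)}{m!}\,t^m \]

for some $s$ satisfying $|s| < |t|$. By Lemma 2.1,

\[ \frac{g^{(m)}(s)}{m!} = e^{x}(1-s)^{-\alpha-1-m}\,L_m^\alpha(x/(1-s))\,e^{-x/(1-s)}. \]

It follows from Lemma 2.6 that

\[ \left|e^{-x/(1-s)}L_m^\alpha(x/(1-s))\right| \le C'm^\alpha \]

for some constant $C' > 0$ (depending on $\alpha$). Thus,

\[ \left|(1-t)^{-\alpha-1}\int_0^\infty f(x/y)\,x^\alpha e^{-x/(1-t)}\,dx - \sum_{k=0}^{m-1}t^k\int_0^\infty f(x/y)\,L_k^\alpha(x)\,x^\alpha e^{-x}\,dx\right| \]

is bounded above by

\[ C'\left(\int_0^\infty |f(x/y)|\,x^\alpha\,dx\right)(1-s)^{-\alpha-1}\,m^\alpha\left|\frac{t}{1-s}\right|^m. \qquad (2.1) \]

The integral in (2.1) is finite because of the bound on $|f|$ in the lemma statement.
Because $|t| < 1/3$ and $|s| < |t|$ we have

\[ \left|\frac{t}{1-s}\right| < \frac{1}{2}, \]

and hence (2.1) tends to 0 as $m \to \infty$. □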



Proposition 2.8 Let $\alpha > -1/2$, and suppose $f\colon [0,\infty) \to \mathbb{R}$ is continuous,
and satisfies $|f(x)| \le C(1+|x|)^{-\alpha-1-\varepsilon}$ for some $C > 0$ and $\varepsilon > 0$. Then $f$ has
the $\alpha$-SILP property iff

\[ f(x) = \int_0^\infty (xy)^{-\alpha/2}J_\alpha(2\sqrt{xy})\,dg(y) \]

for some weakly increasing function $g$.

Note that one can compute directly the Laguerre coefficients of the scalings of
$x^{-\alpha/2}J_\alpha(2\sqrt{x})$ and verify that they are positive (see Example 3 in Section 4.24
of [10]). Proposition 2.8 tells us that this function is essentially the only $\alpha$-SILP
function.

Proof We know that $f$ has the $\alpha$-SILP property iff for every $y > 0$,

\[ \sum_{k \ge 0} t^k\int_0^\infty f(x/y)\,L_k^\alpha(x)\,x^\alpha e^{-x}\,dx \]

has non-negative coefficients as a power series in $t$. By Lemma 2.7, we can
write this function (for small $t$) as

\[ (1-t)^{-\alpha-1}\int_0^\infty f(x/y)\,x^\alpha e^{-x/(1-t)}\,dx, \]

which is a positive constant (a power of $y$) times

\[ (1-t)^{-\alpha-1}\int_0^\infty f(x)\,x^\alpha e^{-xy/(1-t)}\,dx. \]

Define $F$ to be the Laplace transform of $x \mapsto x^\alpha f(x)$. Then $f$ has the $\alpha$-SILP
property iff

\[ (1-t)^{-\alpha-1}F(y/(1-t)) \]

has non-negative coefficients as a power series in $t$. We can rescale $t$ by a factor
of $y$ and pull out a power of $y$ to see that this happens iff

\[ (1/y-t)^{-\alpha-1}F(1/(1/y-t)) \]

has non-negative coefficients. That happens for all $y > 0$ iff the function
$u \mapsto u^{-\alpha-1}F(1/u)$ has successive derivatives alternating in sign (the function
is non-negative, its derivative non-positive, its second derivative non-negative,
etc.). By Bernstein's theorem (Theorem 12b of Chapter IV of [22, page 161]),
this holds iff it is the Laplace transform of a positive measure.

Thus, we have shown that $f$ has the $\alpha$-SILP property iff there is a weakly
increasing function $g$ such that for $u > 0$,

\[ u^{-\alpha-1}\int_0^\infty f(x)\,x^\alpha e^{-x/u}\,dx = \int_0^\infty e^{-yu}\,dg(y). \]

To finish proving the proposition, we can work as follows. We know that

\[ \int_0^\infty f(x)\,x^\alpha e^{-xu}\,dx = u^{-\alpha-1}\int_0^\infty e^{-y/u}\,dg(y). \]

We can now apply the following general theorem for inverting a Laplace transform:
if

\[ \phi(u) = \int_0^\infty \psi(x)\,e^{-xu}\,dx, \]

then

\[ \psi(x) = \lim_{k\to\infty}\frac{(-1)^k}{k!}\,\phi^{(k)}\!\left(\frac{k}{x}\right)\left(\frac{k}{x}\right)^{k+1} \]

wherever $\psi$ is continuous. (See Corollary 6a.2 of Chapter VII in [22, page 289].)

We can apply this to our equation, and differentiate under the integral sign
(justified since the differentiated integrals converge uniformly as $u$ ranges over
any compact subset of $(0,\infty)$; see Theorem 14 of Chapter 10 in [23, page 358]).
Using Lemma 2.1, it follows that

\[ f(x) = \lim_{k\to\infty}\int_0^\infty k^{-\alpha}\,L_k^\alpha(xy/k)\,e^{-xy/k}\,dg(y). \]

To finish the proof, we apply Lemma 2.6, but we need to check that passage to
the limit under the integral sign is justified. Because of the uniform convergence,
it is justified as long as constant functions are integrable with respect to $dg$.
However, that is true, for the following reason. By definition, $g$ satisfies

\[ u^{-\alpha-1}\int_0^\infty f(x)\,x^\alpha e^{-x/u}\,dx = \int_0^\infty e^{-yu}\,dg(y), \]

which is equivalent to

\[ \int_0^\infty f(ux)\,x^\alpha e^{-x}\,dx = \int_0^\infty e^{-yu}\,dg(y). \]

When we let $u \to 0+$, the left side converges to

\[ f(0)\int_0^\infty x^\alpha e^{-x}\,dx \]

(by the dominated convergence theorem: recall that $f$ is bounded and continuous),
so the right side converges as $u \to 0+$. By monotone convergence, we
see that constant functions are integrable with respect to $dg$, which is what we
need. □
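
The Laplace inversion step in the proof above is of Post-Widder type: if $\phi(u) = \int_0^\infty\psi(x)e^{-xu}\,dx$, then $\psi(x) = \lim_{k\to\infty}\frac{(-1)^k}{k!}\phi^{(k)}(k/x)(k/x)^{k+1}$ at points of continuity (we take the inversion theorem in this standard form, which is an assumption about the precise statement cited). A minimal sketch for the test function $\psi(x) = e^{-x}$, with $\phi(u) = 1/(1+u)$ and $\phi^{(k)}(u) = (-1)^kk!/(1+u)^{k+1}$, shows the convergence; the function name `post_widder` is ours.

```python
import math

def post_widder(x, k):
    """k-th Post-Widder approximant to psi(x) = exp(-x), using phi(u) = 1/(1+u).

    Here (-1)^k / k! * phi^(k)(u) = (1+u)^(-(k+1)), so the approximant equals
    (u / (1+u))^(k+1) with u = k/x; it is evaluated through logarithms."""
    u = k / x
    return math.exp((k + 1) * (math.log(u) - math.log1p(u)))

if __name__ == "__main__":
    for k in (10, 100, 1000, 10000):
        print(k, post_widder(1.0, k), math.exp(-1.0))
```

In the proof, the same formula is applied with $\psi(x) = x^\alpha f(x)$, and Lemma 2.1 turns the $k$-th derivative into a Laguerre polynomial.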

Corollary 2.9 For integers $n > 1$, a function $f\colon [0,\infty) \to \mathbb{R}$ has the $(n/2-1)$-SILP
property iff the function from $\mathbb{R}^n$ to $\mathbb{R}$ given by $x \mapsto f(|x|^2)$ is continuous,
satisfies

\[ |f(|x|^2)| \le C(1+|x|)^{-n-\varepsilon} \]

for some $C > 0$ and $\varepsilon > 0$, and is the Fourier transform of a non-negative
distribution.

Corollary 2.9 follows from combining Proposition 2.8 with Theorem 9.10.3 of [1] 
(see Proposition 2.1 of [4]), after some changes of variables. Using Corollary 2.9, 
one can check with some simple manipulations that for n > 1, Theorem 2.5 
implies Theorem 1.1 for lattice packings (and, as pointed out above, the gen- 
eral case can be proved similarly). It is seemingly more general, because it 
does not constrain the Fourier transform at infinity. However, the additional 
generality does not seem useful, and one could likely generalize the proof in [4] 
to use a version of Poisson summation with fewer hypotheses (for example, see 
Theorem D.4.1 in [1]).

Corollary 2.10 For $\alpha > -1/2$, the product of two $\alpha$-SILP functions is always
an $\alpha$-SILP function.

Corollary 2.10 follows immediately from Corollary 2.9 when $\alpha = n/2 - 1$ with
$n \in \mathbb{Z}$, and can be proved for arbitrary $\alpha$ using Proposition 2.8 together with
13.46 (3) of [21] or (7) from Section 3 of [18]. It seems surprisingly difficult to
prove directly from the definition of an $\alpha$-SILP function: it would follow trivially
if the product of two Laguerre polynomials were a positive combination of
Laguerre polynomials, but that is not the case. In fact, the coefficients of such
a product alternate in sign; that is, the polynomials $(-1)^kL_k^\alpha$ have the property
that the set of positive combinations of them is closed under multiplication.
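
The sign pattern of products of Laguerre polynomials can be seen in small cases with exact arithmetic. The sketch below is restricted to $\alpha = 0$ for simplicity (our choice); it expands $L_jL_k$ in the basis $L_0, L_1, \dots$ by eliminating leading coefficients. For instance, $L_1^2 = L_0 - 2L_1 + 2L_2$. All function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def laguerre_coeffs(k):
    """Monomial coefficients of L_k (alpha = 0): sum_j C(k, j) (-x)^j / j!."""
    return [Fraction((-1) ** j * comb(k, j), factorial(j)) for j in range(k + 1)]

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def to_laguerre_basis(p):
    """Coefficients c_i with p = sum_i c_i L_i, found by eliminating leading terms."""
    p = list(p)
    coeffs = [Fraction(0)] * len(p)
    for d in range(len(p) - 1, -1, -1):
        basis = laguerre_coeffs(d)
        c = p[d] / basis[d]
        coeffs[d] = c
        for j in range(d + 1):
            p[j] -= c * basis[j]
    return coeffs

def product_expansion(j, k):
    """Expansion of L_j * L_k in the Laguerre basis (alpha = 0)."""
    return to_laguerre_basis(poly_mul(laguerre_coeffs(j), laguerre_coeffs(k)))

if __name__ == "__main__":
    print(product_expansion(1, 1))
    print(product_expansion(1, 2))
```

In both printed expansions the coefficient of $L_i$ has the sign $(-1)^{i+j+k}$ (or vanishes), which is the alternation described above.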

3 Optimality of Bessel functions 

Let $j_\nu$ denote the first positive root of $J_\nu$. According to Proposition 6.1 of [4],
the function $f\colon \mathbb{R}^n \to \mathbb{R}$ defined by

\[ f(x) = \frac{J_{n/2}(j_{n/2}|x|)^2}{(1-|x|^2)\,|x|^n} \qquad (3.1) \]

satisfies the hypotheses of Theorem 1.1, and leads to the upper bound

\[ \frac{j_{n/2}^n}{(n/2)!^2\,4^n} \]

for the densities of n-dimensional sphere packings. The Fourier transform $\hat f$
has support in the ball of radius $j_{n/2}/\pi$ about the origin. We will show that
among all such functions, $f$ proves the best sphere packing bound. This is
analogous to a theorem of Sidel'nikov [15] for the case of error-correcting codes
and spherical codes. It was first proved in the setting of sphere packings by
Gorbachev [8]. Our proof will be based on the same identity as Gorbachev's,
but the proof of the identity appears to be new.
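
For orientation, the bound $j_{n/2}^n/((n/2)!^2\,4^n)$ can be evaluated numerically. The sketch below assumes the bound has exactly this form, with $(n/2)! = \Gamma(n/2+1)$; it computes $J_\nu$ from its power series and locates the first positive root by scanning and bisection. The function names, step size, and series truncation are illustrative assumptions. In dimension 1, where $j_{1/2} = \pi$, the bound equals 1.

```python
import math

def bessel_j(nu, x, terms=100):
    """J_nu(x) for x > 0 from its power series, built up term by term."""
    half = x / 2.0
    term = half ** nu / math.gamma(nu + 1)
    total = 0.0
    for m in range(terms):
        total += term
        term *= -half * half / ((m + 1) * (m + nu + 1))
    return total

def first_positive_root(nu, step=0.05):
    """First positive zero of J_nu: scan upward from nu + step (J_nu is positive
    there for nu > 0) until the sign changes, then bisect."""
    lo = nu + step
    while bessel_j(nu, lo + step) > 0:
        lo += step
    hi = lo + step
    for _ in range(100):
        mid = (lo + hi) / 2
        if bessel_j(nu, mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def density_bound(n):
    """j_{n/2}^n / ((n/2)!^2 * 4^n), an upper bound on the packing density."""
    j = first_positive_root(n / 2)
    return j ** n / (math.gamma(n / 2 + 1) ** 2 * 4 ** n)

if __name__ == "__main__":
    for n in (1, 2, 3, 8, 24):
        print(n, density_bound(n))
```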

For notational simplicity, we view $f$ and $\hat f$ as functions on $[0,\infty)$; that is, $f(r)$
will denote the common value of $f$ on all vectors of length $r$. Let $\nu = n/2 - 1$,
and let $\lambda_1 < \lambda_2 < \cdots$ be the positive roots of $J_{\nu+1}(x)$ (equivalently, the
positive roots of $-\nu J_\nu(x) + xJ_\nu'(x)$; see equation (4) in Section 3.2 of [21]).
Define $B_r(x)$ to be the closed ball of radius $r$ about $x$.

Our main technical tool is the following identity due to Ben Ghanem and Frappier
(the $p = 0$ case of Lemma 4 in [2]), who state it with weaker technical
hypotheses and a different proof.

Theorem 3.1 (Ben Ghanem and Frappier [2]) Let $f\colon \mathbb{R}^n \to \mathbb{R}$ be a radial
Schwartz function. If $\mathrm{supp}(\hat f) \subseteq B_r(0)$, then

\[ \hat f(0) = \frac{(n/2)!\,2^n}{\pi^{n/2}r^n}\,f(0) + \sum_{m=1}^\infty \frac{4\lambda_m^{n-2}}{(n/2-1)!\,\pi^{n/2}r^n\,J_{n/2-1}(\lambda_m)^2}\,f\!\left(\frac{\lambda_m}{\pi r}\right). \]

We will postpone the proof of Theorem 3.1 until we have developed several 
lemmas. First, however, we deduce the desired optimality: 

Corollary 3.2 (Gorbachev [8]) Suppose $f\colon \mathbb{R}^n \to \mathbb{R}$ is a radial admissible
function, is not identically zero, and satisfies the following three conditions:

(1) $f(x) \le 0$ for $|x| \ge 1$,

(2) $\hat f(t) \ge 0$ for all $t$, and

(3) $\mathrm{supp}(\hat f) \subseteq B_{j_{n/2}/\pi}(0)$.

Then

\[ \frac{\pi^{n/2}}{(n/2)!\,2^n}\,\frac{f(0)}{\hat f(0)} \ge \frac{j_{n/2}^n}{(n/2)!^2\,4^n}. \]

Proof of Corollary 3.2 Let $r = j_{n/2}/\pi$. If $f$ were a Schwartz function, then
Theorem 3.1 would imply that

\[ \hat f(0) \le \frac{(n/2)!\,2^n}{\pi^{n/2}(j_{n/2}/\pi)^n}\,f(0), \]

since $\lambda_m/(\pi r) \ge 1$ for $m \ge 1$. For more general functions $f$, the series

\[ \frac{(n/2)!\,2^n}{\pi^{n/2}r^n}\,f(0) + \sum_{m=1}^\infty \frac{4\lambda_m^{n-2}}{(n/2-1)!\,\pi^{n/2}r^n\,J_{n/2-1}(\lambda_m)^2}\,f\!\left(\frac{\lambda_m}{\pi r}\right) \]

at least still converges, since the terms are $O(m^{-1-\varepsilon})$ for some $\varepsilon > 0$ (namely,
the $\varepsilon$ from the definition of admissibility); to verify this, note that $\lambda_m$ grows
linearly with $m$, and that $J_\nu(z)^2 + J_{\nu+1}(z)^2 \sim 2/(\pi z)$ (see Section 7.21 of [21,
page 200]), so $J_\nu(\lambda_m)^2 \sim 2/(\pi\lambda_m)$. However, we must verify that it converges
to $\hat f(0)$.

We need to smooth $\hat f$ without increasing its support. Let $i_\delta$ denote any non-negative,
smooth function of integral 1 with support in the ball of radius $\delta$
about the origin. Let $f_\varepsilon(x) = f(x(1-\varepsilon))\,\hat i_{r\varepsilon/2}(x)$, where $r = j_{n/2}/\pi$. This is
a Schwartz function whose Fourier transform has support in the ball of radius
$r(1-\varepsilon/2)$, so Theorem 3.1 applies to $f_\varepsilon$. As $\varepsilon \to 0+$, the functions $f_\varepsilon$ and
$\hat f_\varepsilon$ converge pointwise to $f$ and $\hat f$, respectively. Since $|\hat i_{r\varepsilon/2}| \le 1$ everywhere,
dominated convergence lets us interchange the limit as $\varepsilon \to 0+$ with the sum
over $m$ to conclude that

\[ \hat f(0) = \frac{(n/2)!\,2^n}{\pi^{n/2}r^n}\,f(0) + \sum_{m=1}^\infty \frac{4\lambda_m^{n-2}}{(n/2-1)!\,\pi^{n/2}r^n\,J_{n/2-1}(\lambda_m)^2}\,f\!\left(\frac{\lambda_m}{\pi r}\right), \]

and we finish the proof as before. □



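Lemma 3.3 Let $f\colon \mathbb{R}^n \to \mathbb{R}$ be a radial Schwartz function. If $\mathrm{supp}(\hat f) \subseteq
B_r(0)$, then for $u \in [0,1)$, with $\nu = n/2 - 1$,

\[ 2\pi\hat f(ru)\,r^{\nu+2} = \frac{2\,\Gamma(\nu+2)}{(\pi r)^\nu}\,f(0) + \sum_{m=1}^\infty \frac{2}{J_\nu(\lambda_m)^2}\left(\frac{\lambda_m}{2\pi r}\right)^{\nu} f\!\left(\frac{\lambda_m}{2\pi r}\right)\frac{J_\nu(\lambda_m u)}{u^\nu}. \]

The same holds even if $\hat f$ is not smooth at radius $r$ (but is left continuous at
radius $r$, and still smooth at all smaller radii), as long as the values of $f$ in the
sum decrease faster than any power of $1/m$ as $m \to \infty$.

Note that if $f$ is a Schwartz function, then the condition on the decay of the
values of $f$ automatically holds.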



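Proof Because $\mathrm{supp}(\hat f) \subseteq B_r(0)$, we have

\[ x^\nu f(x) = \int_0^1 g(u)\,u^{\nu+1}J_\nu(2\pi rux)\,du, \]

where $g(u) = 2\pi\hat f(ru)\,r^{\nu+2}$ (see Theorem 9.10.3 of [1], or Proposition 2.1 of
[4]). We begin by expanding $g(u)u^\nu$ into a Dini series. For a quick introduction
to Dini series, see [10, page 130]. Unfortunately, for a technical reason that
reference does not cover the case we need here (see footnote 33 on page 130). For
a more thorough reference, which covers everything we need, see Sections 18.3-
18.35 of [21]. In Watson's notation, we are dealing with the case $H + \nu = 0$ (see
page 597 of [21]). Convergence of the Dini series to $g(u)u^\nu$ for $u \in (0,1)$ follows
from standard results (see pages 601-602 of [21]), and at $u = 0$ it follows from
continuity of $g$ at 0 and uniform convergence of the Dini series (which itself
follows from the decay of $f(\lambda_m/(2\pi r))$).

The Dini series expansion of $g(u)u^\nu$ is

\[ g(u)u^\nu = 2(\nu+1)\,u^\nu\int_0^1 t^{\nu+1}g(t)\,t^\nu\,dt + \sum_{m=1}^\infty b_mJ_\nu(\lambda_mu), \]

where

\[ b_m = \frac{2\lambda_m^2\int_0^1 t\,g(t)\,t^\nu J_\nu(\lambda_mt)\,dt}{(\lambda_m^2-\nu^2)\,J_\nu(\lambda_m)^2 + \lambda_m^2J_\nu'(\lambda_m)^2}. \]

Note also that

\[ \lim_{x\to 0}\int_0^1 g(u)\,u^{\nu+1}J_\nu(2\pi rux)/x^\nu\,du = \int_0^1 g(u)\,u^{\nu+1}\frac{(\pi ru)^\nu}{\Gamma(\nu+1)}\,du, \]

since as $x \to 0$,

\[ J_\nu(x) \sim \frac{x^\nu}{2^\nu\,\Gamma(\nu+1)}, \]

so

\[ \int_0^1 t^{2\nu+1}g(t)\,dt = f(0)\,\Gamma(\nu+1)/(\pi r)^\nu. \]

Furthermore, $\lambda_mJ_\nu'(\lambda_m) = \nu J_\nu(\lambda_m)$, so

\[ b_m = \frac{2\int_0^1 g(t)\,t^{\nu+1}J_\nu(\lambda_mt)\,dt}{J_\nu(\lambda_m)^2} = \frac{2}{J_\nu(\lambda_m)^2}\left(\frac{\lambda_m}{2\pi r}\right)^{\nu} f\!\left(\frac{\lambda_m}{2\pi r}\right). \]

Thus,

\[ g(u)u^\nu = \frac{2\,\Gamma(\nu+2)}{(\pi r)^\nu}\,f(0)\,u^\nu + \sum_{m=1}^\infty \frac{2}{J_\nu(\lambda_m)^2}\left(\frac{\lambda_m}{2\pi r}\right)^{\nu} f\!\left(\frac{\lambda_m}{2\pi r}\right)J_\nu(\lambda_mu), \]

as desired. □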



Lemma 3.4 Let $f$ be a function from $[0,\infty)$ to $\mathbb{R}$. The function $x \mapsto f(|x|)$
from $\mathbb{R}^n$ to $\mathbb{R}$ is the Fourier transform of a compactly supported distribution iff
$f$ extends to an even, entire function on $\mathbb{C}$ that satisfies

\[ |f(z)| \le C(1+|z|)^k\,e^{C'|\operatorname{Im} z|} \]

for some $C$, $C'$, and $k$.



Proof This lemma is essentially a special case of the Paley-Wiener-Schwartz
theorem (Theorem 7.3.1 in [9]). The only difference is that the general theorem
is not restricted to radial functions, and characterizes Fourier transforms of
compactly supported distributions as entire functions $g$ of $n$ complex variables
satisfying

\[ |g(z_1,\dots,z_n)| \le C\left(1+\sqrt{|z_1|^2+\cdots+|z_n|^2}\right)^k e^{C'\sqrt{(\operatorname{Im} z_1)^2+\cdots+(\operatorname{Im} z_n)^2}}. \qquad (3.2) \]

The only subtlety in deriving the lemma from the general theorem is in showing
that if $f$ satisfies the hypotheses above, then the function $g$ defined by

\[ g(z_1,\dots,z_n) = f\left(\sqrt{z_1^2+\cdots+z_n^2}\right) \]

satisfies (3.2). To do that, the elementary inequality

\[ \left|\operatorname{Im}\sqrt{z_1^2+\cdots+z_n^2}\right| \le \sqrt{(\operatorname{Im} z_1)^2+\cdots+(\operatorname{Im} z_n)^2} \]

can be used. To prove that inequality, one can use induction to reduce to the
n = 2 case, and prove that case by direct manipulation of both sides. □
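
The elementary inequality $|\operatorname{Im}\sqrt{z_1^2+\cdots+z_n^2}| \le \sqrt{(\operatorname{Im}z_1)^2+\cdots+(\operatorname{Im}z_n)^2}$ used in this proof can be spot-checked on random inputs. The sketch below is a numerical illustration rather than a proof; the sampling range, the seed, and the function names are our assumptions. The left side does not depend on the branch of the square root, since the two branches differ by a sign.

```python
import cmath
import math
import random

def radial_imag(zs):
    """|Im sqrt(z_1^2 + ... + z_n^2)| for a list of complex numbers."""
    return abs(cmath.sqrt(sum(z * z for z in zs)).imag)

def imag_norm(zs):
    """sqrt((Im z_1)^2 + ... + (Im z_n)^2)."""
    return math.sqrt(sum(z.imag ** 2 for z in zs))

def worst_violation(samples=2000, max_dim=5, seed=0):
    """Largest observed value of radial_imag - imag_norm over random samples."""
    rng = random.Random(seed)
    worst = -float("inf")
    for _ in range(samples):
        n = rng.randint(1, max_dim)
        zs = [complex(rng.uniform(-10, 10), rng.uniform(-10, 10)) for _ in range(n)]
        worst = max(worst, radial_imag(zs) - imag_norm(zs))
    return worst

if __name__ == "__main__":
    print(worst_violation())
```

The printed value should be at most a rounding error above zero; equality holds when $n = 1$.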



Now we are ready to prove Theorem 3.1. Notice that it says that to determine 
the integral of /, we need only half as many values as we need to reconstruct 
the whole function via Lemma 3.3. This phenomenon is analogous to Gauss-Jacobi
quadrature (see Theorem 14.2.1 of [6]). The proof given below is in fact
modeled after the proof of Gauss-Jacobi quadrature, although carrying it out
rigorously is more involved. 



Proof of Theorem 3.1 Let $\varepsilon > 0$, and define $\tilde h\colon [-1,1] \to \mathbb{R}$ by

\[ \tilde h(u) = \frac{2\,\Gamma(\nu+2)}{(\pi(r/2+\varepsilon))^\nu}\,f(0) + \sum_{m=1}^\infty \frac{2}{J_\nu(\lambda_m)^2}\left(\frac{\lambda_m}{2\pi(r/2+\varepsilon)}\right)^{\nu} f\!\left(\frac{\lambda_m}{2\pi(r/2+\varepsilon)}\right)\frac{J_\nu(\lambda_mu)}{u^\nu}. \]

(The functions $J_\nu(\lambda_mu)/u^\nu$ are even, so this is no different from defining $\tilde h$
on $[0,1]$.) Since $f$ is a Schwartz function, the values of $f$ in the series above
decrease quickly enough that it defines a $C^\infty$ function on $(-1,1)$. Define $\hat h$ by

\[ 2\pi\hat h((r/2+\varepsilon)u)\,(r/2+\varepsilon)^{\nu+2} = \begin{cases} \tilde h(u) & \text{if } |u| \le 1, \text{ and} \\ 0 & \text{otherwise,} \end{cases} \]

and define $h$ to be the Fourier transform of $\hat h$. Then $\mathrm{supp}(\hat h) \subseteq B_{r/2+\varepsilon}(0)$.
By Lemma 3.3, combined with uniqueness for Dini series (which follows from
orthogonality), we have

\[ h\!\left(\frac{\lambda_m}{2\pi(r/2+\varepsilon)}\right) = f\!\left(\frac{\lambda_m}{2\pi(r/2+\varepsilon)}\right) \]

for all $m$, and $h(0) = f(0)$. (Note that $\hat h$ may not be smooth at radius $r/2+\varepsilon$,
but that does not violate the hypotheses of Lemma 3.3.)

Now let $\chi_R$ denote the characteristic function of a ball of radius $R$ about the
origin, so that

\[ \hat\chi_R(x) = J_{n/2}(2\pi R|x|)\,(R/|x|)^{n/2}. \]

The entire function $f - h$ has roots wherever $\hat\chi_{r/2+\varepsilon}$ does, and $\hat\chi_{r/2+\varepsilon}$ has only
single roots, so the quotient $g = (f-h)/\hat\chi_{r/2+\varepsilon}$ is entire.

We would like to conclude that $g$ is the Fourier transform of a compactly
supported distribution. By Lemma 3.4, this requires bounds for $g$, and it is not
obvious that dividing by a Bessel function does not ruin the bounds. We prove
this in two steps. First, Lemma 1 of [11] implies (after rescaling variables) that

\[ |J_{n/2}(z)| \ge \frac{c_1e^{c_2|\operatorname{Im} z|}}{(1+|z|)^{c_3}} \]

whenever $|\operatorname{Im} z| \ge c_4$, for some constants $c_1, c_2, c_3, c_4$, with $c_1 > 0$ of course.
That means that dividing by it does not mess up our bounds when the absolute
value of the imaginary part is at least $c_4$. The second step is to deal with
points near the real axis. Consider a box with sides on the lines with imaginary
part $\pm c_4$ and real part $\pm(k\pi + (n+1)\pi/4)$, where $k$ is a positive integer. By
the maximum principle, the maximum of $g$ over the interior of the box must
occur on the sides. We know that $g$ satisfies the bound we want on the top and
bottom, and $g$ is even, so we only need to estimate $g$ on the right side.

For $z$ in the right half-plane, we have

\[ J_{n/2}(z) = \sqrt{\frac{2}{\pi z}}\left(\cos\!\left(z - \frac{(n+1)\pi}{4}\right)\left(1 + O(1/z^2)\right) + \sin\!\left(z - \frac{(n+1)\pi}{4}\right)O(1/z)\right) \]

(see (1) in Section 7.21 of [21]). When $z$ has real part $k\pi$, we have $\cos(z) =
(-1)^k\cosh(\operatorname{Im} z)$, which has absolute value at least 1. Thus, on the right side
of the box, the cosine factor is always at least 1 in absolute value. The sine factor is bounded,
because $\operatorname{Im} z$ is bounded, so we see that on the right side of the box $|J_{n/2}(z)|$
is never smaller than a power of $1/|z|$.

When we combine these estimates, it follows from Lemma 3.4 that $g$ is the
Fourier transform of a distribution with compact support. Furthermore, the
Titchmarsh-Lions theorem (see Theorem 4.3.3 in [9]) implies that the convex
hull of the support of $\hat f - \hat h$ equals the Minkowski sum of those of $\hat g$ and $\chi_{r/2+\varepsilon}$,
so $\mathrm{supp}(\hat g) \subseteq B_{r/2-\varepsilon}(0)$.

Let $i_\delta$ denote any non-negative, smooth function of integral 1 with support in
the ball of radius $\delta$ about the origin. We have

\[ f\,\hat i_\delta - h\,\hat i_\delta = (\hat\chi_{r/2+\varepsilon}\,\hat i_\delta)\,g. \]

Now both sides are integrable functions (note that this is not obviously true of
either $h$ or $g\hat\chi_{r/2+\varepsilon}$, which is why we had to multiply by $\hat i_\delta$), and we find that

\[ (\hat f * i_\delta)(0) - (\hat h * i_\delta)(0) = \int (\chi_{r/2+\varepsilon} * i_\delta)\,\hat g. \]

Because $\mathrm{supp}(\hat g) \subseteq B_{r/2-\varepsilon}(0)$, if we take $\delta < \varepsilon$ we have

\[ \int (\chi_{r/2+\varepsilon} * i_\delta)\,\hat g = \int \hat g = g(0) = 0, \]

where $g(0) = 0$ because $f(0) = h(0)$. Thus,

\[ (\hat f * i_\delta)(0) = (\hat h * i_\delta)(0). \]

If we let $\delta \to 0+$, we find that $\hat f(0) = \hat h(0)$, because both $\hat f$ and $\hat h$ are continuous
near 0. It follows from the way $h$ was defined that $\hat f(0)$ equals

\[ \frac{(n/2)!\,2^n}{\pi^{n/2}(r+2\varepsilon)^n}\,f(0) + \sum_{m=1}^\infty \frac{4\lambda_m^{n-2}}{(n/2-1)!\,\pi^{n/2}(r+2\varepsilon)^n\,J_{n/2-1}(\lambda_m)^2}\,f\!\left(\frac{\lambda_m}{\pi(r+2\varepsilon)}\right). \]

Now sending $\varepsilon \to 0+$ proves the desired result, by dominated convergence. □



4 The dual program 



It is natural to view choosing the optimal function $f$ in Theorem 1.1 as solving
an infinite-dimensional linear programming problem: if we fix $\hat{f}(0) = 1$, then
we are trying to minimize the linear functional $f(0)$ of $f$, subject to linear
inequalities on $f$. The technicalities are slightly subtle; for example, it is not
immediately clear what the right space of functions to consider is (admissibility
might be too ad hoc). It seems likely that Schwartz functions suffice. One
can come arbitrarily close to the optimum with functions $f$ such that $f$ and
$\hat{f}$ are smooth and rapidly decreasing, where we say $g\colon \mathbb{R}^n \to \mathbb{R}$ is rapidly
decreasing if $g(x) = O((1 + |x|)^{-k})$ for every $k > 0$: given any $f$ that satisfies
the hypotheses of Theorem 1.1, let

\[ f_\delta(x) = \bigl((f * i_\delta * i_\delta)\,\hat{i}_\delta^{\,2}\bigr)\bigl((1 + 2\delta)x\bigr), \]
where $i_\delta$ is any non-negative, smooth function of integral 1 with support in
$B_\delta(0)$. Then $f_\delta$ has the desired properties, still obeys the required inequalities,
and satisfies

\[ \lim_{\delta \to 0+} \frac{f_\delta(0)}{\hat{f}_\delta(0)} = \frac{f(0)}{\hat{f}(0)}. \]

Presumably Schwartz functions also come arbitrarily close, but one would have 
to worry about making the derivatives rapidly decreasing as well. Despite the 
fact that rapidly decreasing functions come close to the optimal bounds, it is 
not clear whether they reach them. For example, even for n = 1, where one 
can write down several explicit functions that solve the sphere packing problem 
(see Sections 3 and 5 of [4]), these functions are not rapidly decreasing. 

In this context, it is natural to study the dual linear program, to prove bounds 
on how good the sphere packing bounds produced by Theorem 1.1 can be. The 
results of Section 3 amount to doing exactly this, for a restricted linear program 
in which we limit the support of $\hat{f}$. Unfortunately, in the unrestricted case the
dual program seems no easier to solve in general than the primal program is. 
However, it leads to several intriguing open problems. 

One formulation of the dual program is as follows: find the largest $c$ such that
there is a tempered distribution $g$ on $\mathbb{R}^n$ satisfying

(1) $g = \delta + h$ with $h \ge 0$,

(2) $\operatorname{supp}(h) \subseteq \{x : |x| \ge 1\}$, and

(3) $\hat{g} \ge c\delta$.

Here $\delta$ is a delta function at the origin, and inequalities between distributions
mean that applying both sides to non-negative functions preserves this inequality.
For $g$ satisfying (1)-(3) above, and any radial function $f$ satisfying the
hypotheses of Theorem 1.1 such that $f$ and $\hat{f}$ are rapidly decreasing, we have

\[ f(0) \ge \int_{\mathbb{R}^n} f g = \int_{\mathbb{R}^n} \hat{f} \hat{g} \ge c \hat{f}(0). \]

Here, we use the fact that one can apply a non-negative tempered distribution
to any rapidly decreasing function, because non-negative tempered distributions
are exactly measures $\mu$ such that

\[ \int_{\mathbb{R}^n} \frac{d\mu(x)}{(1 + |x|)^k} < \infty \]

for some $k$ (see Theorem VII in Chapter 7, Section 4 of [14, page 242]). Thus
$f(0)/\hat{f}(0) \ge c$. The duality theorem of linear programming suggests that there
is no gap between the smallest $f(0)/\hat{f}(0)$ and largest $c$, but it is not clear how
to prove it in this infinite-dimensional setting.

Given any lattice $\Lambda$ with minimum non-zero vector length 1, summing over
$\Lambda$ defines a tempered distribution that clearly satisfies properties (1) and (2),
and Poisson summation implies that it has property (3) as well. As is the
case for the functions $f$, we can rotationally symmetrize $g$, so that $g$ and $\hat{g}$
are positive linear combinations of spherical delta functions, where we define
a spherical delta function $\delta_r$ on $\mathbb{R}^n$ to be a distribution with support on the
sphere of radius $r$ about the origin, such that integrating any function times
$\delta_r$ gives the average of that function over the sphere. One would expect that
the optimal radial $g$ should always be a linear combination of spherical delta
functions, but it is not clear how to prove it. Aside from the origin, $g$ and $\hat{g}$
should be supported on the zeros of the optimal $f$ and $\hat{f}$, respectively, but why
must these zeros even occur at a discrete set of radii?
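
The simplest instance of the Poisson summation construction is $n = 1$ with the lattice $\mathbb{Z}$, where $g = \sum_{k \in \mathbb{Z}} \delta_k$ and $\hat{g} = g$, so that $c = 1$ is admissible in (3). The sketch below checks the chain $f(0) \ge \int f g = \int \hat{f} \hat{g} \ge c \hat{f}(0)$ numerically for one illustrative test function, the triangle $f(x) = \max(0, 1 - |x|)$ with $\hat{f}(t) = (\sin \pi t / \pi t)^2$. This $f$ is an assumption made for the example: its transform decays only quadratically, so it is not rapidly decreasing, but both lattice sums still converge absolutely. The truncation `K` is likewise an illustrative choice.

```python
import math

def f(x):
    """Triangle function: zero (hence non-positive) for |x| >= 1."""
    return max(0.0, 1.0 - abs(x))

def f_hat(t):
    """Fourier transform of f: (sin(pi t) / (pi t))^2, which is non-negative."""
    if t == 0:
        return 1.0
    return (math.sin(math.pi * t) / (math.pi * t)) ** 2

K = 1000  # truncation of the lattice sums (illustrative)
pairing_g = sum(f(k) for k in range(-K, K + 1))          # f paired with g
pairing_g_hat = sum(f_hat(k) for k in range(-K, K + 1))  # f_hat paired with g_hat
c = 1.0  # g_hat = g >= delta for the lattice Z, so c = 1 works

print(f(0), pairing_g, pairing_g_hat, c * f_hat(0))
```

All four printed quantities should agree up to rounding, so in this example every inequality in the chain is an equality.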

Open Question 4.1 Consider tempered distributions $g$ such that $g$ and $\hat{g}$
are linear combinations of spherical delta functions. Is every such distribution
in the span of the rotationally symmetrized Poisson summation distributions?

It seems very unlikely that the answer to Question 4.1 is yes. Any counterexample
would be of interest, since the optimal distributions $g$ in most dimensions
(not 1, 2, 8, or 24) are probably counterexamples.

One interesting case is 72 dimensions. It is an open question whether there
exists an "extremal lattice of Type II" in $\mathbb{R}^{72}$, in other words, an even
unimodular lattice in $\mathbb{R}^{72}$ with minimal non-zero norm at least 8 (see [5, page 194]
for more details). Such a lattice might be as extraordinary as $E_8$ or the Leech
lattice. Unfortunately, it seems unlikely that one exists. However, its existence
cannot be ruled out by Theorem 1.1. The simplest way to see that is in light of
Section 2. A proof that the lattice did not exist would amount to a proof that
its theta series could not exist. However, although the extremal lattice may
not exist, there is a modular form that would be its theta series if it did exist
(see [5, page 195]). In fact, the modular form comes from a distribution $g$ as
above, because it is a polynomial in the theta series of $E_8$ and the Leech lattice,
and therefore comes from a $g$ that is the corresponding linear combination of
Poisson summation for direct sums of $E_8$ and the Leech lattice. If $\Theta_n$ denotes
the theta series of $E_8$, the Leech lattice, and the hypothetical 72-dimensional
lattice for $n = 8, 24, 72$, respectively, then

\[
\Theta_{72} = \frac{79}{1080}\Theta_{24}^3 + \frac{1183}{720}\Theta_{24}^2\Theta_8^3
  - \frac{91}{180}\Theta_{24}\Theta_8^6 - \frac{91}{432}\Theta_8^9.
\]

Despite the minus signs, all the coefficients of $\Theta_{72}$ are non-negative.
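
The first few coefficients of this combination can be checked with exact arithmetic. The sketch below assumes the standard facts that $\Theta_8$ is the Eisenstein series $E_4 = 1 + 240\sum_{m \ge 1} \sigma_3(m) q^m$ and that $\Theta_{24} = E_4^3 - 720\Delta$ with $\Delta = q\prod_{m \ge 1}(1 - q^m)^{24}$, where the coefficient of $q^m$ counts vectors of norm $2m$. The truncation order `N` and the helper names are illustrative choices, and a finite check of this kind says nothing about coefficients beyond the truncation.

```python
from fractions import Fraction

N = 8  # number of q-expansion coefficients to keep (illustrative)

def mul(a, b):
    """Product of two power series, truncated to N coefficients."""
    out = [0] * N
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < N:
                out[i + j] += ai * bj
    return out

def power(a, k):
    """k-th power of a power series, truncated to N coefficients."""
    out = [1] + [0] * (N - 1)
    for _ in range(k):
        out = mul(out, a)
    return out

def sigma3(m):
    """Sum of the cubes of the divisors of m."""
    return sum(d**3 for d in range(1, m + 1) if m % d == 0)

# Theta series of E_8, assumed equal to the Eisenstein series E_4.
theta8 = [1] + [240 * sigma3(m) for m in range(1, N)]

# Delta = q * prod_{m >= 1} (1 - q^m)^24, truncated.
eta24 = [1] + [0] * (N - 1)
for m in range(1, N):
    factor = [0] * N
    factor[0], factor[m] = 1, -1
    eta24 = mul(eta24, power(factor, 24))
delta = [0] + eta24[:-1]

# Theta series of the Leech lattice, assumed equal to E_4^3 - 720 * Delta.
theta24 = [e - 720 * d for e, d in zip(power(theta8, 3), delta)]

def theta72():
    """Evaluate the displayed combination as a list of exact fractions."""
    terms = [
        (Fraction(79, 1080), 3, 0),
        (Fraction(1183, 720), 2, 3),
        (Fraction(-91, 180), 1, 6),
        (Fraction(-91, 432), 0, 9),
    ]
    total = [Fraction(0)] * N
    for coeff, p24, p8 in terms:
        prod = mul(power(theta24, p24), power(theta8, p8))
        total = [t + coeff * c for t, c in zip(total, prod)]
    return total

print([int(c) for c in theta72()])
```

The expansion begins $1 + 0q + 0q^2 + 0q^3 + 6218175600\,q^4$, so the combination has no vectors of norm 2, 4, or 6, as an extremal theta series should, and the coefficients up to the truncation order are non-negative integers.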

The most elegant form of the dual program comes from a rescaling analogous to
that in Theorem 3.2 of [4]. Define a relaxed lattice to be a tempered distribution
$g$ such that $g$ and $\hat{g}$ are of the form

\[ \sum_{i \ge 0} a_i \delta_{r_i} \]

with $a_i \ge 0$ for all $i$ (not all $0$), and $0 = r_0 < r_1 < r_2 < \cdots$. Call a relaxed
lattice $g$ self-dual if $\hat{g} = g$. How large can $r_1$ be?

Conjecture 4.2 In every dimension, the largest possible value of $r_1$ in a self-dual
relaxed lattice equals the smallest value of $r$ possible in Theorem 3.2 of [4].



One might imagine that the self-duality in Conjecture 4.2 would follow from 
some sort of symmetry of the linear programming problem, but that is not 
clear. If this conjecture is true, it would explain the otherwise remarkable fact 
that the minimal values of r in Proposition 7.1 and Theorem 3.2 of [4] always 
seem to agree (see Conjecture 7.2 in that paper). 